Read Me

Please cite:

Andrew Piper and James Manalad, "Measuring Unreading," Goethe Yearbook 27 (2020).

James Manalad was responsible for collecting and OCRing the Goethe Jahrbuch data and for implementing the text-reuse algorithm.

Andrew Piper was responsible for the collection of the Goethe corpus data, the analysis of the reuse data, and the writing of the article.


This folder contains the original text data for the Goethe Jahrbuch and for Goethe's corpus.

It also contains the derived data of 3-gram matches between the corpora, which readers can review for further insights or problems.
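For readers unfamiliar with the term, the following is a minimal Python sketch of what a 3-gram match between two texts can look like. It is not the text-reuse algorithm used for the article (that implementation is not reproduced here). It assumes word-level 3-grams and simple lowercasing, and the function names and the second example sentence are hypothetical.

```python
import re


def word_ngrams(text, n=3):
    """Return the set of word n-grams in a text (assumption: word-level, lowercased)."""
    tokens = re.findall(r"\w+", text.lower())
    # Texts shorter than n tokens yield an empty set.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_ngrams(text_a, text_b, n=3):
    """Return the n-grams that occur in both texts."""
    return word_ngrams(text_a, n) & word_ngrams(text_b, n)


# A line from Goethe and a hypothetical sentence that partly reuses it.
source = "Wer nie sein Brot mit Tränen aß"
article = "Der Vers wer nie sein Brot gegessen hat wird hier zitiert"

matches = shared_ngrams(source, article)
for gram in sorted(matches):
    print(" ".join(gram))
```

Exact set intersection of this kind only finds verbatim overlaps, so OCR errors or spelling variants in either corpus would cause matches to be missed.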

All code used to analyze the data is included in the goetheUnread.R file.

Derived data for the topic models and distinctive words are also included, along with the code and parameters used to derive them.

All tables and figures used in the article are also included here.